Unsupervised Acquisition of Verb Subcategorization Frames from Shallow-Parsed Corpora
Abstract
In this paper, we report experiments on the unsupervised automatic acquisition of Italian and English verb subcategorization frames (SCFs) from general and domain-specific corpora. The proposed technique operates on shallow-parsed corpora using a limited number of search heuristics that do not rely on any prior lexico-syntactic knowledge about SCFs. Although preliminary, the reported results are in line with those of state-of-the-art lexical acquisition systems. We also explore whether verbs with similar SCF distributions share similar semantic properties, by clustering verbs whose frames have the same distribution using the Minimum Description Length (MDL) principle. First experiments in this direction were carried out on Italian verbs, with encouraging results.
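The abstract does not spell out the clustering criterion, so the following is only a minimal sketch of how an MDL-style decision about grouping two verbs could look. It assumes a simple two-part code (parameter cost plus data cost in bits); the function names, the frame labels (`NP`, `NP_NP`, `NP_PP`, `INTRANS`), and the counts are all hypothetical and are not taken from the paper.

```python
import math
from collections import Counter


def data_bits(counts):
    """Bits needed to encode the observed frames under their own
    maximum-likelihood distribution."""
    total = sum(counts.values())
    return -sum(c * math.log2(c / total) for c in counts.values() if c > 0)


def model_bits(counts):
    """Parameter cost: (k - 1) free parameters at 0.5 * log2(N) bits each,
    where k is the number of observed frames and N the number of tokens."""
    total = sum(counts.values())
    k = sum(1 for c in counts.values() if c > 0)
    return 0.5 * (k - 1) * math.log2(total) if total > 0 else 0.0


def description_length(counts):
    """Two-part description length: model cost plus data cost."""
    return model_bits(counts) + data_bits(counts)


def merge_gain(a, b):
    """Bits saved by describing two verbs with one shared frame
    distribution instead of two separate ones (positive = merge)."""
    separate = description_length(a) + description_length(b)
    merged = description_length(a + b)  # Counter addition pools the counts
    return separate - merged


# Hypothetical frame counts, invented for illustration only.
give = Counter({"NP": 40, "NP_NP": 35, "NP_PP": 25})
send = Counter({"NP": 42, "NP_NP": 33, "NP_PP": 25})
sleep = Counter({"INTRANS": 95, "NP": 5})

print(merge_gain(give, send) > 0)   # similar distributions: merging is cheaper
print(merge_gain(give, sleep) > 0)  # dissimilar distributions: it is not
```

Under these assumptions, two verbs with nearly identical frame distributions yield a positive gain, because one shared parameter set replaces two, while verbs with very different distributions yield a negative gain, because the pooled distribution encodes both verbs' data poorly.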
Similar resources
The Automatic Acquisition Of Frequencies Of Verb Subcategorization Frames From Tagged Corpora
We describe a mechanism for automatically acquiring verb subcategorization frames and their frequencies in a large corpus. A tagged corpus is first partially parsed to identify noun phrases and then a linear grammar is used to estimate the appropriate subcategorization frame for each verb token in the corpus. In an experiment involving the identification of six fixed subcategorization frames, o...
Learning Automatic Acquisition of Subcategorization Frames Using Bayesian Inference and Support Vector Machines
Learning Bayesian Belief Networks (BBN) from corpora and Support Vector Machines (SVM) have been applied to the automatic acquisition of verb subcategorization frames for Modern Greek. We are incorporating minimal linguistic resources, i.e. basic morphological tagging and phrase chunking, to demonstrate that verb subcategorization, which is of great significance for developing robust natural la...
Combining Bayesian and Support Vector Machines Learning to automatically complete Syntactical Information for HPSG-like Formalisms
Learning Bayesian Belief Networks (BBN) from corpora and incorporating the extracted inferring knowledge with a Support Vector Machines (SVM) classifier has been applied to the automatic acquisition of verb subcategorization frames for Modern Greek. We have made use of minimal linguistic resources, such as basic morphological tagging and phrase chunking, to demonstrate that verb subcategorizati...
AFAST: An Automatic Frames Acquisition System
This paper describes an unsupervised strategy to acquire lexico-semantic frames (LSFs) of verbs from sententially parsed corpora (at the syntactic level). The problems in acquiring LSFs include verb sense ambiguity, the diversity of linguistic usages, and the lack of complete frame slots in a single sentence. We propose a specific clustering technique based on the Minimum Description Length (MDL) princ...
Learning Verb Subcategorization from Corpora: Counting Frame Subsets
We present some novel machine learning techniques for the identification of subcategorization information for verbs in Czech. We compare three different statistical techniques applied to this problem. We show how the learning algorithm can be used to discover previously unknown subcategorization frames from the Czech Prague Dependency Treebank. The algorithm can then be used to label dependents...
Publication date: 2008